IMPROVED CLONING VECTOR CONTAINING MARKER
INACTIVATION SYSTEM

FIELD OF THE INVENTION 

This invention relates to cloning systems using marker
inactivation for the identification of recombinants containing
the insertion of a nucleic acid molecule. More particularly, the
present invention relates to lacZα gene fragments having improved
accuracy and reliability in detecting the insertion of a nucleic
acid molecule therein.

BACKGROUND OF THE INVENTION 

The industrial applications of genetic engineering are becoming
evident in the production of pharmaceuticals, of foods having
improved properties, and of chemical products (including enzymes)
to facilitate manufacturing processes. The process of genetic
engineering may begin by cloning a gene of interest which encodes
a protein with the desired properties for the particular
industrial application. Typically, cloning a gene is done by
either breaking up a genome into manageable sized fragments, or
generating cDNA fragments from isolated mRNA, and then cloning
those genomic or cDNA fragments into a vector and introducing the
resultant recombinant vectors into a competent host cell.
Commonly used methods for screening transformants, to identify a
transformant that contains a recombinant vector with a nucleic
acid molecule inserted therein, include marker inactivation
systems, including marker inactivation systems which utilize
various indicator reporter genes including lacZ or lacZα, galK,
the gene for chloramphenicol acetyltransferase, the gene for the
green fluorescent protein (GFP) and mutant forms thereof (see
Cubitt et al., 1995, Trends in Biochem. 20:448-455), the gene for
luciferase and the like; and positive selection systems which
utilize lethal genes including ccdB (Bernard et al., 1994, Gene
148:71-74), the gene for mouse transcription factor GATA-1
(Trudel et al., 1996, BioTechniques 20:684-693), the gene for
thymidine kinase, the gene for β-lactamase and the like.

The lac operon marker inactivation system is employed in one of
the most widely used color selection systems for plasmids and
single-stranded DNA (ssDNA) vectors (see, e.g., Messing et al.,
1977, Proc. Natl. Acad. Sci. USA 74:3642-3646; Messing et al.,
1981, Nucl. Acids Res. 9:309-321; Messing, 1983, Methods Enzymol.
101:20-78; and Yanisch-Perron et al., 1985, Gene 33:103-119).
Essentially, the lac operon marker inactivation system functions
by intracistronic complementation between the α-peptide encoded
by the lacZα gene fragment, and a β-galactosidase molecule that
most commonly carries a deletion of amino acids 12 through 42.

lacZα is a gene fragment, comprising the proximal portion of the
Escherichia coli lacZ gene, which encodes approximately 60 of the
amino terminal amino acids of the β-galactosidase polypeptide
chain. The encoded product, the "α-peptide", complements the
defective activity of the gene product of lacZΔM15, an allele
that carries a spontaneous deletion of the codons for amino acids
12 through 42 of β-galactosidase. Thus, to identify a
transformant that contains a recombinant vector with a nucleic
acid molecule inserted therein, a vector having a cloning site in
the lacZα gene fragment is introduced into a host cell expressing
a β-galactosidase having a deletion of amino acids 12 through 42.
Transformants, presumably containing vector carrying an intact
lacZα gene fragment, produce blue colonies or plaques when
applied onto media containing a chromogenic β-galactosidase
substrate. This is because functional β-galactosidase activity is
achieved by complementation between the α-peptide and a
β-galactosidase molecule carrying the deletion, thereby cleaving
a chromogenic substrate such as
5-bromo-4-chloro-3-indolyl-β-D-galactoside ("X-gal") to produce
deep blue dibromodichloroindigo. In contrast, transformants
containing vector carrying a lacZα gene fragment having an
insertion produce colorless (white) colonies or plaques when
similarly plated. Colorless colonies result when the inserted
nucleic acid molecule interrupts expression of the lacZα gene
fragment so that the complementing α-peptide is not produced.

Currently, all lacZα-based vectors (e.g. Messing et al., 1977,
supra; Yanisch-Perron et al., 1985, supra; Guan et al., 1987,
Gene 67:21-30; Short et al., 1988, Nucl. Acids Res. 16:7583-7600;
Alting-Mees and Short, 1989, Nucl. Acids Res. 17:9494; Evans et
al., 1995, BioTechniques 19:130-135; and U.S. Patent No.
4,766,072) employ the same mechanism for color selection. This
mechanism involves placement of restriction sites for insertion
of a nucleic acid molecule upstream of the codon for amino acid 7
of β-galactosidase, wherein the inserted nucleic acid molecule
("insert") results in interference with the expression, but not
the activity, of the lacZ α-peptide. The current marker
inactivation configuration has the disadvantage that problems
arise in the detection of recombinant molecules. More
specifically, false positives (white colonies or plaques
containing vector not having an insert) and false negatives
(colored colonies or colored plaques containing vector that has
an insert) may be generated (see, e.g., Messing, 1983, supra;
unpublished observations; and Table 2 herein).

Although false positive results are difficult to eliminate owing
to the fact that they arise to a large extent out of factors
which are extraneous to the selection system itself, these do not
generally constitute a problem since they are selected alongside
actual positives and are subjected to further scrutiny before
their fate is decided. Among the external factors responsible for
generating false positives are (i) contamination of restriction
and modification enzymes with exonucleases, polymerases or other
restriction enzymes; (ii) spontaneous mutations; and (iii) loss
of the F' episome carrying the lacZΔM15 allele.
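
The two error classes defined above can be made concrete with a
short sketch. It is illustrative only: the function names and the
(color, has_insert) record layout are assumptions introduced here,
not part of the specification. A white colony is treated as the
"positive" call for an insert, as in the text.

```python
from collections import Counter

def classify_colony(color: str, has_insert: bool) -> str:
    """Label one screened colony or plaque (hypothetical helper).

    White is the positive call for an insert; any other color is
    treated as a negative call.
    """
    called_positive = color.strip().lower() == "white"
    if called_positive and has_insert:
        return "true positive"
    if called_positive and not has_insert:
        return "false positive"   # white, but no insert
    if has_insert:
        return "false negative"   # colored, but carries an insert
    return "true negative"

def tally(colonies):
    """Count each outcome over an iterable of (color, has_insert) pairs."""
    return Counter(classify_colony(color, has_insert)
                   for color, has_insert in colonies)
```

For example, tally([("blue", True), ("white", False)]) counts one
false negative and one false positive.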

False negatives, on the other hand, represent a problem as they
are rarely carried forward for further examination and, as a
result, are responsible for numerous erroneous conclusions. Such
erroneous conclusions include, at least in part, the general
phenomenon referred to as "non-clonable sequences", and the
presence of an excessive number of gaps in shotgun DNA sequencing
results. False negatives are caused both by extrinsic factors and
by factors which are intrinsic to the architecture of the color
selection mechanism itself. In the currently available
lacZα-based vectors, there are two principal causes of false
negatives: (i) in-frame insertion of DNA fragments containing one
or more open reading frames; and (ii) reinitiation of translation
within the mRNA transcribed from the inserted DNA fragment at any
in-frame AUG, GUG or even UUG and CUG preceded by a pseudo
Shine-Dalgarno box. Events arising out of either of these two
instances result in the synthesis of α-peptides bearing
amino-terminal fusions. Since neither amino- nor carboxy-terminal
fusions to the α-peptide usually impair its activity (see, e.g.,
Slilaty et al., 1990, Eur. J. Biochem. 194:103-108), blue
colonies or blue plaques indistinguishable from those colonies or
plaques produced by vectors not carrying an insert are formed.
The number of false negatives produced in like manner is further
augmented by the fact that even the less frequent fusions, having
diminished levels of α-peptide activity, produce blue colonies or
blue plaques due to the hypersensitivity of the X-gal assay
system. The hypersensitivity of the X-gal system reflects the
fact that very little β-galactosidase activity is needed for a
complete color-producing reaction to take place.
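
The two intrinsic causes of false negatives described above can be
sketched as a simple sequence check. This is a simplified
illustration under stated assumptions, not a method taken from the
specification: the insert is given as its sense-strand DNA, the
upstream cloning site is assumed to be in frame with the downstream
α-peptide codons, and the Shine-Dalgarno-like context is not
examined. All names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG", "TTG", "CTG"}  # DNA forms of AUG, GUG, UUG, CUG

def _open_to_end(seq: str, start: int) -> bool:
    """True if codons read from `start` end exactly at the end of
    `seq` (so downstream codons stay in frame) and include no stop."""
    if (len(seq) - start) % 3 != 0:
        return False
    return all(seq[i:i + 3] not in STOPS for i in range(start, len(seq), 3))

def false_negative_risks(insert: str) -> set:
    """Return which of the two mechanisms an insert could trigger."""
    seq = insert.upper()
    risks = set()
    # (i) in-frame insertion that reads through as an open reading frame
    if _open_to_end(seq, 0):
        risks.add("in-frame fusion")
    # (ii) reinitiation at a start codon in frame with the alpha-peptide
    for i in range(len(seq) - 2):
        if seq[i:i + 3] in STARTS and _open_to_end(seq, i):
            risks.add("reinitiation")
            break
    return risks
```

An insert for which the returned set is empty would, under these
assumptions, be expected to give a white colony in a conventional
vector; any non-empty result flags a possible blue false negative.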

Hypersensitivity of the X-gal assay system is also responsible
for another source of false negatives. This source of false
negatives arises as a result of β-galactosidase-like activity
produced by the ebg locus of the host cell. The ebg (evolved
β-galactosidase) operon is located directly across the chromosome
from lacZ and codes for an enzyme that has low level
β-galactosidase-like activity (Hall et al., 1989, Genetics
123:635-648). In wild-type strains, this enzyme does not have
enough activity to allow growth on lactose. However, in typical
screening protocols, host cells suspected of being transformants
are grown in the presence of an inducer of lacZα gene expression.
In such circumstances, the enzyme typically having a low level
β-galactosidase-like activity has enough activity in the presence
of such inducers (e.g., isopropyl thiogalactoside or "IPTG") to
cleave the chromogenic substrate X-gal, thus yielding bluish
colonies, or more frequently white colonies with blue centers
(unpublished observations). The effects of the ebg locus on blue
color formation, in colonies that otherwise would be white, may
be minimized by keeping the incubation of plated cells short
(less than 18 hours), or completely eliminated by using hosts
carrying a defective ebg locus.

Thus, there is a need for a cloning vector utilizing the lacZα
marker inactivation system, wherein the cloning vector is based
on a configuration which minimizes the generation of false
negatives. Such a novel cloning vector allows for improved
accuracy and reliability in detecting the inactivation of the
lacZα gene fragment caused by insertion of a nucleic acid
molecule. The novel cloning vector may be used for general
cloning purposes, as well as for gap-free shotgun sequencing, in
facilitating industrial applications of gene isolation, genetic
engineering and development of ordered genomic libraries.

SUMMARY OF THE INVENTION

In accordance with the present invention, disclosed is a marker
inactivation system which utilizes lacZα in a configuration which
minimizes the generation of false negatives during screening
processes for recombinant clones.

In the development of the vector according to the present
invention, it was an unexpected result to find that accurate and
reliable inactivation of lacZα occurs only when a nucleic acid
molecule is inserted in the region of the lacZα gene fragment
that encodes amino acids 8 to 38 of β-galactosidase. Thus, of the
amino acids encoded by a lacZα gene fragment, residues
corresponding to amino acids 8 to 38 of β-galactosidase have been
found to be required for functional α-peptide activity for
complementation in vivo.
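
As a rough illustration of the region just described, the sketch
below maps a nucleotide position to a codon number and checks
whether a candidate cloning site lies wholly within codons 8 to
38. It assumes, purely for illustration, that nucleotide offset 0
is the first base of the codon numbered 1 in the wild-type
β-galactosidase sequence; the function names and the six-base
default site length are likewise assumptions.

```python
def codon_of(nt_offset: int) -> int:
    """1-based codon number containing a 0-based nucleotide offset."""
    if nt_offset < 0:
        raise ValueError("offset must be non-negative")
    return nt_offset // 3 + 1

def site_in_window(site_start: int, site_len: int = 6,
                   first_codon: int = 8, last_codon: int = 38) -> bool:
    """True if every base of the site falls within the codon window."""
    first = codon_of(site_start)
    last = codon_of(site_start + site_len - 1)
    return first_codon <= first and last <= last_codon
```

Under this numbering, a six-base site starting at offset 21 spans
codons 8 and 9 and falls inside the window, whereas one starting
at offset 18 begins in codon 7 and does not.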

Thus, in one embodiment of the present invention, the vector has
at least one promoter operatively linked to a DNA sequence
encoding an α-peptide, wherein the resultant α-peptide is capable
of complementation with a defective β-galactosidase molecule
(e.g. one that carries a deletion of the amino acids 12 through
42) thereby resulting in β-galactosidase activity. At least one
cloning site, and preferably multiple cloning sites cleaved by
distinct restriction enzymes, is included within the region of
the DNA sequence encoding the α-peptide, wherein the region
corresponds to the DNA encoding amino acids 8 to 38 of
β-galactosidase as shown in SEQ ID NO:1. As appreciated by one
skilled in the art from the disclosure of the present invention,
modifying the wild type lacZα gene fragment to encode functional
α-peptides having altered codons as well as conservative and/or
nonconservative substitutions included within, but not limited
to, the region of amino acids 8 to 38 of β-galactosidase, can
produce DNA sequences with one or more restriction enzyme sites
contained therein. Additional embodiments of the present
invention include the inclusion in the vector of other features
useful for protein expression and other molecular manipulations
including, but not limited to, DNA sequences selected from the
group consisting of one or more antibiotic resistance genes or
auxotrophic genes to aid in selection of recombinants, a ribosome
binding site, regulatory elements, at least one origin of
replication ("replicon"), a transcription terminator, at least
one phage promoter, a phage origin of replication, and
combinations thereof. Those skilled in the art will recognize
that the teachings provided herein can readily be applied to
indicator marker, reporter, or positive selection genes other
than lacZ or lacZα to produce cloning vectors which minimize the
generation of false negatives during screening processes for
recombinant clones as detailed herein for lacZα.

A preferred plasmid vector constructed in accordance with the
present invention, designated pTrueBlue™, was constructed using
commercially available plasmids, and using standard methods known
to those skilled in the art including restriction enzyme
digestion, and site-directed mutagenesis.

A preferred phage vector constructed in accordance with the
present invention, designated M13TrueBlue-BAC™, was constructed
using commercially available phage, and using standard methods
known to those skilled in the art including restriction enzyme
digestion, and site-directed mutagenesis.

A preferred bacterial artificial chromosome vector constructed in
accordance with the present invention, designated TrueBlue™, was
constructed using commercially available vector and using
standard methods known to those skilled in the art including
enzyme digestions and ligations.

The vector according to the present invention is utilized by
cleaving the vector with at least one restriction enzyme that is
specific to at least one selected restriction site which has been
introduced in the region corresponding to the DNA encoding amino
acids 8 to 38 of β-galactosidase as illustrated in SEQ ID NO:1. A
nucleic acid molecule is then cloned into the cleaved vector. The
resultant recombinant vectors are introduced into competent host
cells, and transformed host cells are then selected for and
screened by growth in the presence of a chromogenic substrate
(e.g., X-gal or MacConkey agar) which can be acted upon by
β-galactosidase. Clones containing vector carrying an intact
lacZα gene fragment produce colored colonies or plaques when
grown in the presence of media containing a chromogenic
β-galactosidase substrate. Clones containing vector carrying a
lacZα gene fragment according to the present invention and having
an insertion therein produce colorless (white) colonies or
plaques when similarly plated.

In a further embodiment of the plasmid vector according to the
present invention, the plasmid vector has been designed to
provide capabilities for in vitro preparation of RNA probes,
creation of nested deletions through ExoIII protection sites,
manipulation of large DNA inserts via sites for 8-base cleaving
restriction enzymes, preparation of ssDNA, and protein
expression.

These and other objects, features, and advantages of the present
invention will become apparent from the following drawings and
detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates the amino acid sequence of the α-peptide and
some of the possible palindromes or restriction enzyme sites
which may be introduced into a region of the lacZα coding
sequence.

FIG. 2A is a schematic illustration of an embodiment of a plasmid
construct according to the present invention.

FIG. 2B is an enlarged view of a region contained within the
plasmid construct shown in FIG. 2A, illustrating multiple cloning
sites within a region of the lacZα coding sequence (see bracket
labeled "Color Selection Cloning Sites" and various other
features; see also SEQ ID NO:7).

FIG. 3A is a schematic illustration of an embodiment of a phage
construct according to the present invention.

FIG. 3B is an enlarged view of a region contained within the
phage construct shown in FIG. 3A, illustrating multiple cloning
sites within a region of the lacZα coding sequence (see bracket
labeled "Color Selection Cloning Sites"; see also SEQ ID NO:10).

FIG. 4A is a schematic illustration of a bacterial artificial
chromosome vector according to the present invention.

FIG. 4B is an enlarged view of a region contained within the
bacterial artificial chromosome vector shown in FIG. 4A,
illustrating multiple cloning sites within a region of the lacZα
coding sequence (see bracket labeled "Color Selection Cloning
Sites" and various other features; see also SEQ ID NO:11).

DETAILED DESCRIPTION OF THE INVENTION 
Definitions

By the term "α-peptide" is meant, for the purposes of the
specification or claims, a peptide that is capable of
complementing a defective β-galactosidase molecule (e.g. one
having a deletion of amino acids 12 through 42, or amino acids
24-32) such that functional β-galactosidase activity is achieved.
While the α-peptide typically used in vivo comprises the first 60
amino acids of the amino terminus of the β-galactosidase
molecule, an α-peptide may comprise more or less than 60 amino
acids. For example, the minimal purified peptide fragment capable
of α-complementation in vitro encompasses a peptide of 39 amino
acids comprising amino acids 4 to 42 (Welply et al., 1981, J.
Biol. Chem. 256:6804-6810). Longer fragments of β-galactosidase,
including theoretically the full-length β-galactosidase chain,
are also functional as α-peptides (e.g. Slilaty et al., 1990,
supra). Additionally, the α-peptide may contain conservative
substitutions of the amino acid sequence shown in SEQ ID NO:1. A
conservative substitution of one or more amino acids is such that
the folding of the α-peptide, and the ability for
α-complementation, are substantially unchanged. "Conservative
substitutions" is defined by the aforementioned function, and
includes substitutions of amino acids having substantially the
same charge, size, hydrophilicity, hydrophobicity, and/or
aromaticity as the amino acid replaced. Such substitutions, known
to those of ordinary skill in the art, include, but are not
limited to, glycine-alanine-valine; isoleucine-leucine;
tryptophan-tyrosine; aspartic acid-glutamic acid;
arginine-lysine; asparagine-glutamine; and serine-threonine.
Also, the α-peptide may contain nonconservative substitutions of
the amino acid sequence shown in SEQ ID NO:1. A "nonconservative
substitution" is defined as the substitution of any one amino
acid for any one or more amino acids such that the α-peptide
retains the ability for α-complementation and color production in
cloning processes. Nonconservative substitutions are known in the
art for the α-peptide, as described in Dunn et al. (1988, Protein
Engineering 2:283-291; herein incorporated by reference), and may
be produced using mutagenic procedures such as described herein.

By the terms "lacZα", "lacZα gene fragment" or "lacZα coding
sequence" is meant, for the purposes of the specification or
claims, a DNA sequence which encodes an α-peptide as defined
above. In that regard, and as appreciated by those skilled in the
art, because of codon and third base degeneracy, almost every
amino acid can be represented by more than one triplet codon in a
coding nucleotide sequence. Thus, there are multiple sequences
comprising a lacZα coding sequence, which when compared to each
other are modified slightly in sequence (e.g., substitution of a
nucleotide in a triplet codon), and yet still encode the
α-peptide. By the term "modified lacZα gene fragment" or
"modified lacZα coding sequence" is meant, for the purposes of
the specification or claims, a DNA sequence which encodes an
α-peptide, and which contains one or more cloning sites
introduced into and contained in the coding sequence for
α-peptide amino acids corresponding to amino acid 8 and
downstream of amino acid 8, and particularly corresponding to
amino acids 8 to 38 of β-galactosidase.
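
The codon degeneracy point can be illustrated with a short sketch
that looks for positions where a restriction site could be written
into a coding sequence without changing the encoded amino acids.
It uses the standard genetic code; the function names and the toy
sequences in the usage note are hypothetical and are not taken
from SEQ ID NO:1.

```python
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons ordered T, C, A, G at each position.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

def silent_site_offsets(cds: str, site: str) -> list:
    """Offsets at which overwriting `cds` with `site` leaves the
    translated peptide unchanged."""
    cds, site = cds.upper(), site.upper()
    protein = translate(cds)
    hits = []
    for off in range(len(cds) - len(site) + 1):
        candidate = cds[:off] + site + cds[off + len(site):]
        if translate(candidate) == protein:
            hits.append(off)
    return hits
```

For the toy sequence GCAGGTACAGCA (Ala-Gly-Thr-Ala), the call
silent_site_offsets("GCAGGTACAGCA", "GGTACC") returns [3]: changing
ACA to ACC creates the site while still encoding threonine.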

By the term "β-galactosidase" is meant, for the purposes of the
specification and claims, the wild-type or naturally occurring
β-galactosidase enzyme encoded by the lacZ gene of E. coli and
other bacteria. In that regard, and for the purposes of the
specification and claims, all references to codon or amino acid
numbers in connection with the α-peptide, lacZα, lacZα gene
fragment, lacZα coding sequence, or modified lacZα coding
sequences are to codons or amino acids that correspond to their
counterparts in the standard lacZ gene or the wild-type
β-galactosidase sequences.

By the term "six-base palindrome" is meant, for the purposes of
the specification and claims, a double-stranded DNA sequence of
six nucleotides that is the same when either of the strands is
read in a defined direction. Thus, six-base palindrome includes
any of the following 64 possible sequences:

AAATTT, AACGTT, AAGCTT, AATATT, ACATGT, ACCGGT, ACGCGT,
ACTAGT, AGATCT, AGCGCT, AGGCCT, AGTACT, ATATAT, ATCGAT,
ATGCAT, ATTAAT, CAATTG, CACGTG, CAGCTG, CATATG, CCATGG,
CCCGGG, CCGCGG, CCTAGG, CGATCG, CGCGCG, CGGCCG, CGTACG,
CTATAG, CTCGAG, CTGCAG, CTTAAG, GAATTC, GACGTC, GAGCTC,
GATATC, GCATGC, GCCGGC, GCGCGC, GCTAGC, GGATCC, GGCGCC,
GGGCCC, GGTACC, GTATAC, GTCGAC, GTGCAC, GTTAAC, TAATTA,
TACGTA, TAGCTA, TATATA, TCATGA, TCCGGA, TCGCGA, TCTAGA,
TGATCA, TGCGCA, TGGCCA, TGTACA, TTATAA, TTCGAA, TTGCAA,
TTTAAA
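
The count of 64 follows from the definition: the first three bases
may be chosen freely (4 x 4 x 4 = 64 choices) and the last three
are then fixed as their reverse complement. A minimal sketch that
enumerates every sequence meeting the definition is given below;
the function names are illustrative only.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def six_base_palindromes() -> list:
    """All six-base sequences equal to their own reverse complement,
    in alphabetical order."""
    halves = ("".join(p) for p in product("ACGT", repeat=3))
    return sorted(half + reverse_complement(half) for half in halves)
```

Each returned sequence reads identically on both strands; GAATTC,
for example, is the recognition sequence of EcoRI.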

By the term "operably linked" is meant, for the purposes of the
specification and claims, the chemical fusion, ligation, or
synthesis of DNA such that a promoter-DNA sequence combination is
formed in a proper orientation and reading frame for the DNA
sequence to be transcribed into functional RNA and expressed as a
protein or a peptide. Transcription from the promoter-DNA
sequence may or may not be regulated by the promoter, and
possibly in combination with other regulatory elements. In the
construction of the promoter-DNA sequence combination, it is
generally preferred to position the promoter at a distance
upstream from the initial codon of the DNA sequence that is
approximately the same as the distance between the promoter and
the gene it controls in its natural setting. However, as known in
the art, substantial variation in the distance can be
accommodated without loss of promoter function.

By the term "DNA molecule" is meant, for the purposes of the
specification and claims, any nucleic acid sequence including,
but not limited to, a gene or a gene fragment, natural or
synthetic DNA, coding or noncoding DNA, DNA complementary to RNA
and so on. The expressed proteins or peptides may include
biologically-active and/or commercially valuable molecules known
to those skilled in the art.

By the term "introduction", when used in reference to a host
cell, is meant, for the purposes of the specification and claims,
standard procedures known in the art for introducing recombinant
vector DNA into the target host cell. Such procedures include,
but are not limited to, transfection, infection, transformation,
natural uptake, and electroporation.

By the term "promoter" is meant, for the purposes of the
specification and claims, a nucleotide sequence, natural or
synthetic, capable of binding RNA polymerase to initiate
transcription. Such promoters are known to those skilled in the
art and may include bacterial, yeast, viral, eukaryotic or
mammalian promoters, the selection of which depends on the host
cell system used for expression.

By the term "regulatory element" is meant, for the purposes of
the specification and claims, control elements for efficient gene
transcription or message translation including, but not limited
to, enhancers, and transcription or translation initiation and
termination signals. Enhancer sequences are DNA elements that
appear to increase transcriptional efficiency in a manner
relatively independent of their position and orientation with
respect to a nearby gene. Thus, depending on the host cell
expression vector system used, an enhancer may be placed either
upstream or downstream from the inserted DNA molecule to increase
transcriptional efficiency. Such regulatory elements may be
inserted into nearby vector DNA sequences using recombinant DNA
methods known in the art for insertion of DNA sequences.

By the term "vector" is meant, for the purposes of the
specification and claims, a DNA molecule capable of autonomous
replication in a host cell, and which allows for cloning of DNA
molecules. As known to those skilled in the art, a vector
includes, but is not limited to, a plasmid, cosmid, phagemid,
viral vectors, phage vectors, yeast vectors, mammalian vectors
and the like.



In the preferred and illustrated embodiments, the vector
according to the present invention comprises at least one
promoter operably linked to a DNA sequence encoding an α-peptide;
and one or more cloning sites cleavable by distinct restriction
enzymes which have been introduced within the lacZ coding
sequence from and including codon 8 and downstream of codon 8, in
forming a modified lacZα coding sequence. Preferably, the
modified lacZα coding sequence contains restriction enzyme sites
in a region of the DNA sequence encoding the α-peptide, wherein
the region corresponds to the DNA encoding amino acids 8 to 38 of
β-galactosidase as shown in SEQ ID NO:1. Various bacterial,
phage, or plasmid promoters known in the art from which a high
level of transcription has been observed in a host cell system
such as E. coli include, but are not limited to, the lac
promoter, trp promoter, tac promoter, recA promoter, ribosomal
RNA promoter, the PR and PL promoters, T7 promoter, SP6 promoter,
lacUV5, ompF, bla, and lpp. Various prokaryotic replicons are
known to those skilled in the art, and function to direct
autonomous replication and maintenance of a recombinant molecule,
of which it is a part, in a prokaryotic host cell. The vector may
further comprise selection means such as an antibiotic resistance
gene, or a gene that complements an auxotroph. Various antibiotic
resistance genes have been incorporated into vectors for the
purpose of aiding selection of host cell clones containing such
vectors. For example, antibiotic resistance genes incorporated
into vectors intended for introduction into bacterial host cells
include, but are not limited to, a gene that confers resistance
to an antibiotic selected from the group consisting of
ampicillin, kanamycin, tetracycline, neomycin, G418 and
chloramphenicol. Genes for complementing an auxotroph are genes
encoding enzymes or proteins which facilitate usage of
nutritional or functional components by the host such as a
purine, pyrimidine, amino acid (e.g., lysine, tryptophan,
histidine, leucine, cysteine), or sphingolipid.

As appreciated by those skilled in the art, another embodiment of
the vector according to the present invention includes at least
one promoter operatively linked to a DNA sequence encoding an
α-peptide; multiple cloning sites cleavable by distinct
restriction enzymes which have been introduced within the lacZ
coding sequence including codon 8 and downstream of codon 8, in
forming a modified lacZα coding sequence; and a replicon that
functions in eukaryotic cells. In one illustration of this
embodiment, the modified lacZα coding sequence contains
restriction enzyme sites in a region of the DNA sequence encoding
the α-peptide, wherein the region corresponds to the DNA encoding
amino acids 8 to 38 as shown in SEQ ID NO:1. Those skilled in the
art will recognize that other regions downstream of amino acid 38
may be identified by using the methods described in the present
invention. Various promoters for expression in eukaryotic cells
are known in the art, including, but not limited to, viral or
viral-like basal promoters like the SV40 late promoter, the RSV
promoter, the CMV immediate early promoter, and a VL30 promoter;
and yeast or mammalian cellular promoters (see, e.g., Larsen et
al., 1995, Nucleic Acids Res. 23:1223-1230; Donis et al., 1993,
BioTechniques 15:786-787; Donda et al., 1993, Mol. Cell.
Endocrinol. 90:R23-26; and Huper et al., 1992, In Vitro Cell Dev.
Biol. 28A:730-734). Various replicons are known to those skilled
in the art that function in eukaryotic cells to direct
replication and maintenance of a recombinant molecule, of which
it is a part, in a eukaryotic host cell. The vector may further
comprise selection means such as the use of thymidine kinase
gene, an antibiotic resistance gene, or a gene that complements
an auxotroph. Various antibiotic resistance genes have been
incorporated into vectors for the purpose of aiding selection of
eukaryotic host cell clones containing such vectors. For example,
antibiotic resistance genes incorporated into vectors intended
for introduction into eukaryotic host cells include, but are not
limited to, a gene that confers resistance to an antibiotic
selected from the group consisting of neomycin and blasticidin S.
For the lacZα marker inactivation system to work in eukaryotic
cells, it is important to note that the host cell must also be
engineered to express a β-galactosidase molecule to be
complemented by the α-peptide, or the remainder of the lacZ gene
must be included along with the lacZα coding sequence on the same
vector. However, successful expression and detection of the
prokaryotic enzyme β-galactosidase in eukaryotic cells has been
described previously (see, e.g., Rocha et al., 1996, Br. J.
Cancer 74:1216-22), including substrates and suppression of
endogenous activity (see, e.g., Hendrikx et al., 1994, Anal.
Biochem. 222:456-60; Young et al., 1993, Anal. Biochem.
215:24-30).

Additionally, the vector according to the present invention may
be sold in kit form. The kit comprises as a component the vector
in sufficient amounts to perform multiple cloning reactions, and
further comprises a component selected from the group consisting
of host cells into which the recombinant vector is introduced, a
chromogenic substrate (e.g., X-gal or MacConkey agar), an inducer
of lacZα gene expression (e.g., IPTG), and one or more
restriction enzymes specific for restriction enzyme sites within
the modified lacZα coding sequence, and combinations thereof.

EXAMPLE 1 

Illustrated in this example are methods and
compositions for construction of one embodiment of a
plasmid vector according to the present invention. The
starting plasmid selected for vector construction was
plasmid pBluescript II KS(-) (Short et al., 1988, supra;
Alting-Mees and Short, 1989, supra). The initial step in
construction involved removal of the 173-base pair
multiple cloning sites from pBluescript II KS(-) to
generate a progenitor plasmid for use in subsequent
manipulations. This was accomplished by cleavage of
pBluescript II KS(-) with BssHII and religation at low
DNA concentration (5 ng/µl) to generate the plasmid
pSNS416 which contains, sequentially, the ampicillin
resistance gene, the ColE1 origin of plasmid replication
and an out-of-frame lacZα gene fragment
(promoter/operator and first 60 codons of lacZ with a
10-base substitution for codon 6) followed, in the
opposite orientation, by the f1 origin of replication



WO 98/50566    PCT/US98/08854

which, in this configuration, allows for packaging of
the antisense strand of lacZα into phage particles upon
co-infection with a helper phage such as M13KO7 (Vieira
and Messing, 1987, Methods Enzymol. 153:3-11).

To address the problem of unreliability of the
color selection mechanism in the currently used vectors,
a modified lacZα gene fragment was constructed in which
restriction sites recognized by various restriction
enzymes were introduced along the entire length of the
coding sequence of lacZα. This construction, having
multiple restriction enzyme sites, allowed investigation
of the mechanism of color selection as a function of not
only α-peptide expression, but also of α-peptide
complementation function. The strategy illustrated in
this embodiment for engineering these modifications into
lacZα involved saturation of the wild type coding
sequence with restriction enzyme sites by introducing
base pair changes which resulted in creating the desired
restriction enzyme sites but did not affect the coding
specificity of the DNA (e.g., utilizing codon and third
base degeneracy). However, using the methods according
to the present invention and thus encompassed within the
scope of the present invention, base pair changes may be
made which affect the coding specificity of the wild
type lacZα DNA and which result in either conservative
or nonconservative amino acid substitutions that do not
affect α-peptide complementation function in vivo.

Computer-aided designs for implementation of this
strategy were generated by employing commercially
available computer software. Initially, the proximal 60
amino acids of β-galactosidase, as specified by the
shortest lacZ gene fragment known to be sufficient for
providing α-complementation function (Yanisch-Perron et
al., 1985, supra), were back translated into an
ambiguous DNA sequence using the software's back
translation function (see, e.g., FIG. 1). SEQ ID NO:1
shows the first 60 amino acids of β-galactosidase for
which an ambiguous DNA sequence was computer generated.
Next, using ambiguous DNA sequences, a listing was
generated of all possible restriction sites useful for
cloning which may be introduced into a DNA sequence
encoding the amino acid sequence of SEQ ID NO:1 without
affecting the amino acid sequence of SEQ ID NO:1. Since
most cloning experiments require specific DNA termini,
and since the restriction enzymes most useful for
generating such termini in vectors are those that
recognize six-base uninterrupted palindromes, preferred
restriction enzyme sites for the vectors according to
the present invention are those that are six-base
uninterrupted palindromes. Of the 64 theoretically
possible six-base palindromes, greater than 30
restriction enzyme sites recognized by known restriction
enzymes were identified in DNA sequences encoding the
amino acid sequence of SEQ ID NO:1, as shown in the
example in Figure 1. It is appreciated by those skilled
in the art that such restriction enzyme sites may be
other than six-base uninterrupted palindromes. For
example, procedures similar to this can be performed for
eight-base palindromes or other groups of restriction
enzyme sites, depending on the desired cloning
applications.
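The design step described above (back translation of the peptide, followed by a listing of the six-base palindromic sites that can be introduced without changing the encoded amino acids) may be sketched in Python as follows. This is only a minimal illustration under stated assumptions and is not the commercial software referred to above: the codon table is the standard genetic code, the function names `six_base_palindromes` and `silent_site_positions` are hypothetical, and the four-residue peptide in the usage lines is made up for illustration rather than taken from SEQ ID NO:1.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' = stop).
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def six_base_palindromes():
    """All 4**3 = 64 six-base sequences equal to their reverse complement."""
    return [half + reverse_complement(half)
            for half in map("".join, product("ACGT", repeat=3))]


def silent_site_positions(peptide, site):
    """Return the 0-based nucleotide offsets at which `site` can be placed in
    some DNA sequence encoding `peptide` without changing any amino acid."""
    hits = []
    n = 3 * len(peptide)
    for start in range(n - len(site) + 1):
        first = start // 3                       # first codon touched
        last = (start + len(site) - 1) // 3      # last codon touched
        window = peptide[first:last + 1]
        lead = start - 3 * first                 # free bases before the site
        trail = 3 * len(window) - lead - len(site)
        # Try every choice of the flanking bases not fixed by the site.
        for flank in product("ACGT", repeat=lead + trail):
            dna = "".join(flank[:lead]) + site + "".join(flank[lead:])
            codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
            if "".join(CODON_TABLE[c] for c in codons) == window:
                hits.append(start)
                break
    return hits


print(len(six_base_palindromes()))                # -> 64
print(silent_site_positions("MEFK", "GAATTC"))    # -> [3]
```

In the usage lines, the hypothetical peptide Met-Glu-Phe-Lys can carry the palindrome GAATTC only at offset 3, where Glu-Phe is encoded as GAA TTC. Running the same search for each of the 64 palindromes against a peptide of interest yields the kind of site listing described in the text.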

FIG. 1 illustrates the results of the design
strategy showing some of the possible restriction enzyme
sites introducible into the lacZα coding sequence.
Note, however, that FIG. 1 is presented for purposes of
illustration, and not limitation. For example, it is
appreciated by those skilled in the art that other
restriction sites may be created by means including
introduction of codons in the region (codons 8 to 38) of
the lacZα coding sequence which encode conservative
amino acid substitutions such as Leu for Ile, Ala for
Val, Ser for Thr and vice versa; or which encode
nonconservative substitutions by screening for α-peptide
complementation activity from a randomly generated
library of sequences (see, e.g., Dunn et al., 1988,
supra). Several criteria were used to choose which
restriction enzyme sites to introduce along the entire
length of the coding sequence of lacZα. These criteria
included commercial availability of the respective
restriction enzyme at the time the work was performed,
occurrence in the vector, and spacing, nature and
compatibility of the termini. Based on these criteria,
a subset of 13 restriction enzyme sites was selected
for engineering into the region of the coding sequence
of lacZα. These 13 restriction enzyme sites, together
with the recognition sequence for EspI, were introduced
into lacZα by site-directed mutagenesis using the
mutagenic oligonucleotides NV1P (SEQ ID NO:2) and NV2P
(SEQ ID NO:3) to generate the plasmid pSNS448.

More specifically, pSNS416 was subjected to
site-directed mutagenesis using a closing oligonucleotide
method described previously (Slilaty et al., 1990, Anal.
Biochem. 185:194-200) and mutagenic oligonucleotide NV1P
(SEQ ID NO:2) to generate the plasmid pSNS432. Briefly,
0.1 pmol of pSNS416 template DNA was mixed with 2 pmol
of closing oligonucleotide and 10 pmol of NV1P, in a
final volume of 22 µl. To this mixture was added 3 µl
of annealing buffer (200 mM Tris-HCl, pH 7.4, 20 mM
MgCl2, and 500 mM NaCl), and the mixture was then
incubated in a boiling water bath for 3 minutes. The
mixture was then incubated on ice for 2 to 8 minutes,
followed by the addition of 3 µl of DNA synthesis buffer
(300 mM Tris-HCl, pH 7.8, 80 mM MgCl2, 100 mM DTT, 10 mM
ATP, 5 mM each of dGTP, dATP, dCTP, and dTTP, and
500 µg/ml bovine serum albumin), 1 µl of T4 DNA ligase
(1 unit/µl), and 1 µl of Klenow polymerase (7 units/µl),
with subsequent incubation on ice for an additional
30 minutes. The reactions were then sequentially
incubated at room temperature for 30 minutes, and at
37°C for 60 minutes.
The reactions may then be used to transform competent
cells, with subsequent screening for the desired
construction. The resultant plasmid, pSNS432, comprised
restoration of the lacZα reading frame and creation of
sites for the restriction enzymes BclI, EspI, PstI,
NruI, SmaI/XmaI, PvuII, ClaI, and FspI by the
introduction of base changes that do not affect the
original coding capacity of the DNA.
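The volumes and stock concentrations given above fix the composition of the final 30 µl synthesis reaction, and the arithmetic can be sketched as follows. This is an illustrative calculation only: the helper `final_concentrations` and the dictionary keys are hypothetical names, and the contribution of the enzyme storage buffers and of the DNA solutions is assumed to be negligible.

```python
def final_concentrations(additions):
    """additions: list of (volume_ul, {component: stock_concentration}).
    Returns (total_volume_ul, {component: final_concentration}); each
    component keeps the units in which its stock concentration was given."""
    total = sum(vol for vol, _ in additions)
    final = {}
    for vol, stock in additions:
        for name, conc in stock.items():
            final[name] = final.get(name, 0.0) + conc * vol / total
    return total, final


annealing = {"Tris-HCl_mM": 200, "MgCl2_mM": 20, "NaCl_mM": 500}
synthesis = {"Tris-HCl_mM": 300, "MgCl2_mM": 80, "DTT_mM": 100,
             "ATP_mM": 10, "each_dNTP_mM": 5, "BSA_ug_per_ml": 500}

additions = [
    (22, {}),         # template plus closing and mutagenic oligonucleotides
    (3, annealing),   # annealing buffer
    (3, synthesis),   # DNA synthesis buffer
    (1, {}),          # T4 DNA ligase (storage buffer ignored, an assumption)
    (1, {}),          # Klenow polymerase (storage buffer ignored, an assumption)
]

total_ul, conc = final_concentrations(additions)
print(total_ul)                        # -> 30
for name in sorted(conc):
    print(name, round(conc[name], 2))
```

Under these assumptions the completed reaction contains about 50 mM Tris-HCl, 10 mM MgCl2, 50 mM NaCl, 10 mM DTT, 1 mM ATP, 0.5 mM of each dNTP and 50 µg/ml bovine serum albumin. The closing and mutagenic oligonucleotides are present at 20-fold and 100-fold molar excess over the template, respectively.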

The coding sequence of lacZα in pSNS432 was further
modified by a subsequent site-directed mutagenesis
experiment using methods as described above and the
mutagenic oligonucleotide NV2P (SEQ ID NO:3) to generate
the plasmid pSNS448.

The silent nucleotide substitutions introduced into
the coding sequence of lacZα by NV2P created restriction
sites for restriction enzymes NheI, EcoRI, BssHII, StuI,
BglII, and DraI. Thus, pSNS448 contains sites for the
restriction enzymes BclI, EspI, PstI, NruI, SmaI, PvuII,
ClaI, FspI, NheI, EcoRI, BssHII, StuI, BglII, and DraI at
codons 4, 8, 11, 15, 20, 24, 27, 30, 36, 39, 44, 47, 54,
and 55 of lacZα, respectively. Accordingly, one
embodiment of a plasmid vector according to the present
invention is illustrated by pSNS448. More specifically,
one embodiment of a plasmid according to the present
invention comprises a base plasmid vector having a
coding sequence of lacZα having multiple cloning sites
contained in the region corresponding to codons 8
through 38 of lacZ, as illustrated in FIG. 2, which
corresponds to nucleotide position 112 to nucleotide
position 204 of SEQ ID NO:7. Additionally, the modified
lacZα coding sequence of pSNS448 may be used as the
progenitor for embodiments of other vectors according to
the present invention, by using standard molecular
biologic techniques, including for the vector
embodiments pTrueBlue™ and M13TrueBlue™.
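The relationship between the engineered sites, their codon positions and the codon 8 to 38 cloning region may be sketched as a simple lookup. The listing below is illustrative only: the names `PSNS448_SITES`, `sites_in_region` and `region_length_nt` are hypothetical, and each site is represented by the single codon number quoted in the text, although a six-base site necessarily spans more than one codon.

```python
# Site -> lacZ-alpha codon, as listed for pSNS448 in the text.
PSNS448_SITES = {
    "BclI": 4, "EspI": 8, "PstI": 11, "NruI": 15, "SmaI": 20,
    "PvuII": 24, "ClaI": 27, "FspI": 30, "NheI": 36, "EcoRI": 39,
    "BssHII": 44, "StuI": 47, "BglII": 54, "DraI": 55,
}
REGION = (8, 38)  # codons bounding the multiple cloning region


def sites_in_region(sites, region=REGION):
    """Names of the sites whose quoted codon falls inside the region."""
    lo, hi = region
    return [name for name, codon in sites.items() if lo <= codon <= hi]


def region_length_nt(region=REGION):
    """Number of nucleotides spanned by an inclusive codon range."""
    lo, hi = region
    return 3 * (hi - lo + 1)


print(sites_in_region(PSNS448_SITES))
print(region_length_nt())        # -> 93
print(204 - 112 + 1)             # -> 93
```

The 31 codons from 8 through 38 span 93 nucleotides, which agrees with the interval from nucleotide position 112 to nucleotide position 204 of SEQ ID NO:7 quoted above. Eight of the fourteen listed sites (EspI through NheI) fall within the region.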





It will be recognized by those skilled in the art
that the method of the present invention can be readily
applied to genes or gene fragments other than lacZ or
lacZα to generate color indicator cloning vectors (e.g.,
a GFP-based vector, Cubitt et al., 1995, supra; herein
incorporated by reference) or positive selection cloning
vectors (e.g., a ccdB-based vector, Bernard et al., 1994,
supra, herein incorporated by reference, or a
GATA-1-based vector, Trudel et al., 1996, supra, herein
incorporated by reference), having characteristics and
accuracy similar to those of the lacZα-based vectors
described herein.

EXAMPLE 2 

Illustrated in this example are methods and
compositions for construction of another embodiment of a
plasmid vector according to the present invention.
pSNS448, containing a modified lacZα coding sequence,
was further modified in regions upstream and downstream
of the coding sequence of lacZα using the closing
oligonucleotide method described above, and mutagenic
oligonucleotides. The further modifications were
designed to add other features useful for protein
expression and other molecular manipulations. One or
more of the further modifications may be used to achieve
a plasmid vector according to the present invention.
For example, in sequences 5' of the lacZα coding
sequence in pSNS448, the mutagenic oligonucleotide NV5'P
(SEQ ID NO:4) was used to create the plasmid pSNS457 by
looping in sequences for an optimized ribosomal binding
region (Gold and Stormo, 1990, Methods Enzymol.
185:89-90; see nucleotide positions 35 to 46 of SEQ ID
NO:7); sites for the restriction endonucleases NcoI and
SalI (see nucleotide positions 47 to 52, and 52 to 57,
respectively, of SEQ ID NO:7); a phage promoter (e.g., T7
promoter; Schenborn and Mierendorf, 1985, Nucl. Acids
Res. 13:6223-6236; see nucleotide positions 55 to 76 of
SEQ ID NO:7); and sites for the restriction enzymes SfiI,
ApaI/Bsp120I, KpnI/Acc65I, BamHI and XhoI (see
nucleotide positions 77 to 89, 85 to 90, 91 to 96, 98 to
103, and 103 to 108, respectively, of SEQ ID NO:7). The
resultant plasmid, pSNS457, is another embodiment of the
plasmid vector according to the present invention.
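The nucleotide positions quoted above for the features added 5' of the lacZα coding sequence can be tabulated and checked for length and for overlap between neighboring features. The following sketch is illustrative only: the names `UPSTREAM_FEATURES`, `length` and `overlapping_pairs` are hypothetical, the coordinates are simply those quoted in the text for SEQ ID NO:7, and the T7 promoter is used as the example phage promoter.

```python
from itertools import combinations

# Inclusive, 1-based positions in SEQ ID NO:7 as quoted in the text.
UPSTREAM_FEATURES = {
    "ribosomal binding region": (35, 46),
    "NcoI": (47, 52),
    "SalI": (52, 57),
    "T7 promoter": (55, 76),
    "SfiI": (77, 89),
    "ApaI/Bsp120I": (85, 90),
    "KpnI/Acc65I": (91, 96),
    "BamHI": (98, 103),
    "XhoI": (103, 108),
}


def length(span):
    """Length in nucleotides of an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlapping_pairs(features):
    """Return (name_a, name_b, shared_nucleotides) for overlapping features."""
    pairs = []
    for (a, (a0, a1)), (b, (b0, b1)) in combinations(features.items(), 2):
        shared = min(a1, b1) - max(a0, b0) + 1
        if shared > 0:
            pairs.append((a, b, shared))
    return pairs


for name, span in UPSTREAM_FEATURES.items():
    print(name, span, length(span))
print(overlapping_pairs(UPSTREAM_FEATURES))
```

With these coordinates the six-base sites each span six positions and the SfiI site spans thirteen. Four neighboring pairs share nucleotides (NcoI with SalI, SalI with the T7 promoter, SfiI with ApaI/Bsp120I, and BamHI with XhoI), which is how nine features fit between positions 35 and 108.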

Additional modifications may be made in sequences 3'
to the lacZα coding sequence. For example, mutagenic
oligonucleotides such as those illustrated in SEQ ID
NO:5 and SEQ ID NO:6 may be used sequentially with
plasmid pSNS457 to generate the plasmid pSNS524 by
adding restriction sites for the restriction enzymes
HindIII, BstBI, MluI, NsiI/Ppu10I, SacI/Ecl136II, PacI,
BspEI and XbaI; the rho-independent trpA transcription
terminator (Christie et al., 1981, Proc. Natl. Acad.
Sci. USA 78:4180); and an AflII site (as illustrated in
FIG. 2B). The plasmid pSNS524 is another embodiment of
the plasmid vector according to the present invention.

In making the different embodiments of the plasmid
vector according to the present invention, it may be
desirable to substitute one restriction enzyme site for
another or introduce new ones. For example, the PvuII
site (CAGCTG) between the SmaI site and the ClaI site (see
Example 1) in pSNS524 may be converted to a MunI site
(CAATTG) using adapter insertion or other methodology
known to those skilled in the art to generate the
plasmid pSNS527, referred to herein as pTrueBlue™ and
illustrated in Figures 2A and 2B. Additionally, in
another embodiment a phage promoter different from or
the same as that located in the sequences 5' of the
lacZα coding sequence, and in opposite orientation to
that of the lac promoter (e.g., SP6 promoter; Schenborn
and Mierendorf, 1985, supra) may also be added to the
sequences 3' of the lacZα coding sequence.

In making the different embodiments of the plasmid
according to the present invention, it may be desirable
to confirm the intended modifications by DNA sequencing
of both strands of the modified region using the dideoxy
chain termination method or other standard method of DNA
sequencing known in the art. In summary of Examples 1
and 2, one embodiment of the plasmid according to the
present invention comprises at least one promoter (e.g.,
lac promoter, or other promoter depending on the host
cell system) operatively linked to a DNA sequence
encoding an α-peptide; multiple cloning sites consisting
of restriction sites, cleavable by distinct restriction
enzymes, which have been introduced into and are
contained within a region of the DNA sequence encoding
the α-peptide, wherein the region corresponds to the DNA
encoding amino acids 8 to 38 as shown in SEQ ID NO:1;
and a replicon. The plasmid vector according to the
present invention may further comprise at least one
additional feature, located outside the lacZα encoding
sequence, selected from the group consisting of an
antibiotic resistance gene, a ribosomal binding region,
a transcription terminator (for stable clones and
high-level protein expression; see, e.g., nucleotide
positions 335 to 365 of SEQ ID NO:7), at least one phage
promoter (for preparation of RNA probes in vitro), one
or more restriction sites comprising an eight-base
recognition sequence (e.g., for mapping and manipulation
of large inserts), at least one restriction site for an
endonuclease that generates ExoIII-resistant 3'
overhangs (for creating unidirectional deletions; see,
e.g., nucleotide positions 287 to 298 of SEQ ID NO:7), a
phage origin of replication (e.g., f1, Short et al.,
1988, supra; Alting-Mees and Short, 1989, supra; see,
e.g., FIG. 2) inserted in the opposite orientation to
that of the lacZα coding sequence (thereby facilitating
the design of mutagenic, sequencing and other
oligonucleotides by allowing recovery of the antisense
strand), and combinations thereof.

EXAMPLE 3 

Illustrated in this example are methods and
compositions for construction of one embodiment of a
phage vector according to the present invention. The
phage vector according to the present invention contains
one or more cloning sites consisting of restriction
sites cleavable by distinct restriction enzymes within a
region of the DNA sequence encoding the α-peptide,
wherein the region containing the sites corresponds to
the DNA encoding amino acids 8 to 38 as shown in SEQ ID
NO:1. An M13 phage version, containing a lacZα coding
sequence, was constructed by replacing the original
promoter-lacZα coding sequences comprising 548 base
pairs between PvuII and Bsu36I in M13mp19 (Yanisch-Perron
et al., 1985, supra) with the modified lacZα coding
sequences (268 base pairs) from pSNS448 described above.
In one illustration of this method, an AseI-BglII
restriction fragment (from just upstream of the lac
promoter to about codon 54) was removed from pSNS448.
This fragment and the M13mp19 restricted with PvuII and
Bsu36I were filled in to form blunt ends using the
standard methods known to those skilled in the art, and
using the Klenow fragment of DNA polymerase I in a buffer
containing all four nucleotides. Following the fill-in
reactions, the fragment and restricted M13mp19 were
blunt-end ligated. One of the resultant phage isolates,
which was designated M13sp3, contained the modified
lacZα fragment (modified with the multiple restriction
enzyme sites in accordance with the present invention)
in the same orientation as the original lacZα in
M13mp19. Phage M13sp3 is one embodiment of the phage
vector according to the present invention.

Phage M13sp3 was used as a template for further
modifications by subsequent site-directed mutagenesis.
In one modification reaction using the methods as
described above, a mutagenic oligonucleotide (SEQ ID
NO:8) was used to destroy the first of two ClaI sites
within the M13 genome at position 2,527 of the
conventional M13mp19 map (Yanisch-Perron et al., 1985,
supra). The resultant phage, M13sp5, is another
embodiment of the phage vector according to the present
invention. Phage M13sp5 was used as a template in a
further site-directed mutagenesis reaction employing a
mutagenic oligonucleotide (SEQ ID NO:9) to destroy the
second ClaI site at position 6,882 of M13mp19 in
generating the phage M13sp7. In making the different
embodiments of the plasmid vector according to the
present invention, it may be desirable to substitute one
restriction enzyme site for another or introduce new
ones. For example, the PvuII site (CAGCTG) between the
SmaI site and the ClaI site (see Example 1) in M13sp7 may
be converted to a MunI site (CAATTG) using adapter
insertion or other methodology known to those skilled in
the art to generate the phage M13sp13. M13sp13 is another
embodiment of the phage vector according to the present
invention, a schematic map of which is illustrated as
M13TrueBlue in FIGs. 3A and 3B, and the relevant sequence
(lac promoter and modified lacZα coding sequence) of
which is shown in SEQ ID NO:10.

Although it is possible to clone large DNA
fragments in M13, large inserts are known to be unstable
(see, e.g., Messing, 1983, supra; Yanisch-Perron et al.,
1985, supra). It is noted that replacing a 548 base
pair fragment containing lacZα coding sequences with the
modified lacZα coding sequences (268 base pairs) results
in a 280 base pair reduction in size of the vector.
Thus, an additional advantage of the phage vector
according to the present invention is that it would add
to the stability of DNA inserts, as compared to the
currently used M13 phage.

EXAMPLE 4 

Illustrated in this example are methods and
compositions for construction of one embodiment of a
bacterial artificial chromosome vector according to the
present invention. The bacterial artificial chromosome
vector according to the present invention contains one
or more cloning sites consisting of restriction sites
cleavable by distinct restriction enzymes within a
region of the DNA sequence encoding the α-peptide,
wherein the region containing the sites corresponds to
the DNA encoding amino acids 8 to 38 as shown in SEQ ID
NO:1. A bacterial artificial chromosome embodiment of
the present invention was constructed by replacing the
original promoter-lacZα coding sequences comprising
approximately 630 base pairs between NotI and SfiI in
pBeloBAC11 (Shizuya et al., 1992, Proc. Natl. Acad. Sci.
USA 89:8794-8797) with the modified lacZα coding
sequences from pSNS524 described above. In one
illustration of this method, an approximately 423 base
pair restriction fragment from AseI just upstream of the
lac promoter to the AflII restriction site just
downstream of the transcription terminator was removed
from pSNS524. This fragment and the pBeloBAC11 DNA
restricted with NotI and SfiI were filled in to form
blunt ends using the standard methods known to those
skilled in the art, and using the Klenow fragment of DNA
polymerase I in a buffer containing all four
nucleotides. Following the fill-in reactions, the
fragment and restricted pBeloBAC11 were blunt-end
ligated. One of the resultant isolates, which was
designated pSNS528, contained the modified lacZα
fragment (modified with the multiple restriction enzyme
sites in accordance with the present invention) in the
same orientation as the original lacZα in pBeloBAC11. The
plasmid, pSNS528, is one embodiment of the bacterial
artificial chromosome vector according to the present
invention and is illustrated as TrueBlue-BAC™ in Figures
4A and 4B, and the relevant sequence (lac promoter and
modified lacZα coding sequence) of which is shown in SEQ
ID NO:11.

EXAMPLE 5 

Illustrated in this example is the efficiency of
color selection using the modified lacZα gene fragment
(coding sequence) according to the present invention,
and methods and compositions for testing the same. In
one method for evaluating the efficiency of color
selection in the modified lacZα gene fragment, a two to
four base pair insertion or deletion was created at each
of the newly engineered restriction sites in the
modified lacZα coding sequence. This was accomplished
by cleavage of pSNS448 DNA, described above and
representing the modified lacZα coding sequence found in
the vectors according to the present invention, with
different restriction enzymes followed by filling in or
recessing the DNA overhangs by treatment with the Klenow
enzyme and religation of the blunt termini. These
manipulations resulted in the formation of lacZα mutants
in which the reading frame had been shifted at the site
of restriction enzyme cleavage. Shifting the reading
frame is what would be expected from the insertion of a
DNA molecule at that restriction enzyme site.
Transformation of the DNA molecules produced in this
fashion into an indicator host strain of E. coli yielded
both white and blue colonies. The proportion of white
colonies observed for each restriction site is an
indication of the importance of the coding sequence
upstream of that site in producing a functional
α-peptide. It is noted that using this method of
analysis, some blue colonies will always result due to
incomplete reactions at the restriction digestion and
fill-in/recession steps. In fact, control reactions in
which ligase was omitted yielded essentially 100% blue
colonies (e.g., unrestricted vector). The results of
this analysis, shown in Table 1, delineate a region
where interruption of the lacZα coding sequence leads to
the formation of a non-functional α-peptide. As shown
in Table 1, this region extends from the EspI site at
codon 8 to the EcoRI site at codon 39, and does not include
the sequences upstream of codon 7 of the lacZα coding
sequence which are used by the currently available lacZα
vectors for the cloning of DNA inserts. Filling in of
the BssHII site at codon 44 resulted in only 3% white
colonies, indicating the end of the region of the lacZα
coding sequence that is essential for producing a
functional α-peptide (see also FIG. 2B). Also shown in
Table 1 is the number of readthrough amino acids
resulting from the shift in the reading frame.

Table 1

Restriction   lacZα   Read-through   # blue     # white    % white
site          codon   amino acids    colonies   colonies   colonies

EspI            8     n.a.              531         42
PstI           11     12                102        123         55
SmaI           20     16                776        819         51
ClaI           27      1                124        470         79
NheI           36      0                164        767         82
EcoRI          39      0                104        475         82
BssHII         44      3                234          8          3
BglII          54     12                448        283         39

n.a. denotes "not applicable", as a reading frame shift
did not result.
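The percentage of white colonies reported for each site follows directly from the blue and white colony counts, as the short calculation below illustrates. It is offered only as a check on the arithmetic: the names `TABLE1_COUNTS` and `percent_white` are hypothetical, and the counts are those given in Table 1.

```python
# (site, lacZ-alpha codon, blue colonies, white colonies) from Table 1.
TABLE1_COUNTS = [
    ("EspI", 8, 531, 42),
    ("PstI", 11, 102, 123),
    ("SmaI", 20, 776, 819),
    ("ClaI", 27, 124, 470),
    ("NheI", 36, 164, 767),
    ("EcoRI", 39, 104, 475),
    ("BssHII", 44, 234, 8),
    ("BglII", 54, 448, 283),
]


def percent_white(blue, white):
    """White colonies as a percentage of all colonies scored."""
    return 100.0 * white / (blue + white)


for site, codon, blue, white in TABLE1_COUNTS:
    print(f"{site:7s} codon {codon:2d}: {percent_white(blue, white):5.1f}% white")
```

Rounded to whole numbers, the computed values for PstI through BglII are 55, 51, 79, 82, 82, 3 and 39, in agreement with the tabulated percentages. For the EspI row, where fill-in did not shift the reading frame, the counts of 531 blue and 42 white colonies correspond to about 7% white colonies; this figure is calculated here from the counts.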

A second method, used to investigate the mechanism
of color selection in the modified lacZα coding
sequence, involved insertion of a DNA molecule into each
of the newly created restriction enzyme sites. Random
fragments of λ phage DNA were shotgunned into the
various newly introduced restriction sites in lacZα and
the resultant colonies or plaques were sampled and
analyzed for the presence or absence of DNA molecule
inserts. Bacteriophage λ DNA was digested with PstI,
MspI, ApoI, BssHII or BstYI and the resulting fragments
were cloned by shotgun ligation into pSNS448 DNA or
functionally equivalent plasmid, described above and
representing the modified lacZα coding sequence found in
the vectors according to the present invention, which
had been linearized by cleavage with PstI, ClaI, EcoRI,
BssHII, or BglII or BamHI, respectively. For pSNS448
DNA or functionally equivalent plasmid linearized with
the blunt-end cutting enzymes NruI, SmaI and StuI, or
blunted by fill-in as described above for the NheI and
EspI sites, λ DNA digested with HaeIII was used.
Similarly, λ DNA digested with HaeIII was used to
perform random insertions into M13TrueBlue™ replicative
form DNA which had been linearized by cleavage with SmaI
or FspI. When M13TrueBlue™ DNA had been linearized with
ClaI, λ DNA digested with MspI was employed. Following
transformation of E. coli host strain HB2151 and plating
of cells onto media containing X-gal and IPTG, blue
colonies and blue plaques were grown and the plasmid DNA
or replicative form M13 phage DNA was isolated and
screened for carrying an insert within the modified
lacZα gene fragment by cleavage with BamHI plus HindIII
for plasmid DNA, and AvaI plus EcoRI for M13 phage DNA.

Since false negatives (blue colonies or blue
plaques harboring vectors with DNA inserts) are far more
problematic than false positives in terms of their
contribution to errors in screening for recombinant
clones, analysis of lacZα insertional inactivation was
focused almost entirely on understanding the structure
of the plasmid carried by blue colonies or M13 phage DNA
carried by blue plaques. Table 2 shows the results of
analysis of the plasmid isolated from blue colonies for
the presence of DNA molecule inserts, and Table 3 shows
the results of analysis of the M13 phage DNA isolated
from blue plaques for the presence of DNA molecule
inserts. It is evident from these results that blue
colonies and blue plaques correctly reflect the
structure of the respective vector they are carrying
only when insertion of a DNA molecule took place within
the codons encoding the structurally essential elements
of the α-peptide (i.e., codons 8 to 38). When insertion
of the DNA molecule was attempted upstream or downstream
of this essential region, false negatives arose at high
frequencies (see, e.g., Table 2).

Table 2

Analysis of insertions into lacZα in plasmid DNA of blue
colonies

Restriction   lacZα   # insert   # insert   % insert
enzyme        codon   positive   negative   negative

BclI/BamHI      4     16         22
EspI            8     4          11         73%
PstI           11     0          15         100%
NruI           15     0          12         100%
SmaI           20     0          15         100%
ClaI           27     1*         54         98%
FspI           30     n.d.       n.d.       n.a.
NheI           36     0          32         100%
EcoRI          39     12         15         56%
BssHII         44     n.d.       n.d.       n.a.
StuI           47     7          2          22%
BglII          54     6          3          33%

* plasmid dimer anomaly; n.d., not determined; n.a., not
applicable

Table 3

Analysis of insertions into lacZα in M13 phage
DNA of blue plaques

Restriction   lacZα   # insert   # insert   % insert
enzyme        codon   positive   negative   negative

EspI            8     n.d.       n.d.       n.a.
PstI           11     n.d.       n.d.       n.a.
NruI           15     n.d.       n.d.       n.a.
SmaI           20     0          48         100%
ClaI           27     0          43         100%
FspI           30     0          34         100%
NheI           36     n.d.       n.d.       n.a.
EcoRI          39     n.d.       n.d.       n.a.
BssHII         44     n.d.       n.d.       n.a.
StuI           47     n.d.       n.d.       n.a.
BglII          54     n.d.       n.d.       n.a.

n.d., not determined; n.a., not applicable

In summary, the results of the fill-in/recession
studies outlined in Table 1, and the insertional
inactivation experiments detailed in Tables 2 and 3,
collectively define a region of the lacZα coding
sequence where reliable color selection as well as
virtual absence of false negatives can be achieved.
This region extends from the EspI site at codon 8
through the NheI site at codon 36, and to codon 38. The
results obtained for the next restriction site, EcoRI at
codon 39, were mixed. While the fill-in data suggest
that this site is essential for α-peptide function (see
Table 1), the insertional inactivation data suggest
otherwise (see Table 2). It is possible, therefore, that
the region of accuracy extends through the EcoRI site at
codon 39 down to codon 43 just upstream of the BssHII
site at codon 44 where the end of the essential region
of the α-peptide is clearly marked by the concurrence of
both types of data (see Tables 1 and 2). One of the
most important characteristics of this region of the
lacZα coding sequence is its ability to virtually
eliminate the generation of false negatives. In fact,
out of a total of 308 blue colonies or blue plaques
resulting from cloning experiments performed in this
region, only one was found to carry an insert (along with
a second intact copy of the lacZα gene fragment, e.g., the
plasmid dimer anomaly denoted in Table 2). This region
of the lacZα coding sequence therefore, together with
the 10 illustrated restriction enzyme sites, EspI, PstI,
NruI, SmaI/XmaI, PvuII/MunI, ClaI, FspI and NheI,
provides a first opportunity for performing color
selection cloning where the probability of false
negative events is virtually zero. This is
particularly important for development of ordered
genomic libraries and shotgun DNA sequencing procedures
where blue colonies or blue plaques which could contain
DNA fragments essential for formation of a complete
"contig" are not analyzed.

EXAMPLE 6

Illustrated in this embodiment are methods for 
using a 

vector according to the present invention, wherein the 
method comprises using at least one restriction enzyme 

30 site, within a modified lacZot coding sequence, to clone 
a DNA molecule. One illustration of the method of using 
a vector according to the present invention, wherein the 
vector comprises a marker inactivation system utilizing 
a modified lacZot coding sequence, comprises cloning 

35 (directionally or nondirectionally) a DNA molecule into 
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a single restriction enzyme site in the region of the 
modified lacZoc coding sequence, corresponding to DNA 
sequence encoding amino acids 8 to 38 of /3-galactosidase 
as illustrated in SEQ ID NO:l, in forming recombinant 
5 vectors. For example, the DNA molecule may have Clal 
compatible ends, and then be cloned into the Clal site 
in the modified lacZct coding sequence; followed by 4 
introducing the resultant recombinant vectors into 
competent host cells; growing the host cells in the 

10 presence of a chromogenic substrate cleavable by 

/3-galactosidase ; and screening for indicia of lac operon 
marker inactivation selected from the group consisting 
of white colonies (if a plasmid or a bacterial 
artificial chromosome vector is used) , clear plaques (if 

15 a phage vector is used) , and lack of cell -staining (if a 
vector for eukaryotic cells is used) . The method may 
further comprise adding an inducer of lacZa gene 
expression when the host cells are grown in the presence 
of a chromogenic indicator for jS-galactosidase activity 

20 such as x-gal or MacConkey agar. 
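
By way of illustration only, the location of a cloning site relative to the region of codons 8 to 38 can be checked computationally. The sketch below uses the first 40 codons of the reading frame printed in SEQ ID NO:10 of the sequence listing; the helper names codon_span and in_region are hypothetical and are not part of the disclosed method.

```python
# Codons 1-40 of the modified lacZ-alpha reading frame, copied from the
# coding portion of SEQ ID NO:10 in the sequence listing.
CODONS = (
    "ATG ACC ATG ATC ACG GAC AGC TTA GCC GTC "
    "GTT CTG CAG CGT CGC GAC TGG GAA AAC CCG "
    "GGC GTT ACC CAG CTG AAT CGA TTA GCT GCG "
    "CAT CCC CCA TTC GCT AGC TGG CGG AAT TCC"
)
SEQ = "".join(CODONS.split())

# Recognition sequences of some of the enzymes named in the text.
SITES = {
    "PstI": "CTGCAG",
    "NruI": "TCGCGA",
    "SmaI/XmaI": "CCCGGG",
    "PvuII": "CAGCTG",
    "ClaI": "ATCGAT",
    "FspI": "TGCGCA",
    "NheI": "GCTAGC",
}

def codon_span(site, seq=SEQ):
    """Return (first_codon, last_codon), 1-based, touched by the first
    occurrence of `site` in `seq`, or None if the site is absent."""
    start = seq.find(site)
    if start < 0:
        return None
    end = start + len(site) - 1
    return start // 3 + 1, end // 3 + 1

def in_region(span, first=8, last=38):
    """True if the whole recognition sequence lies within codons first..last."""
    return span is not None and first <= span[0] and span[1] <= last

for name, site in SITES.items():
    span = codon_span(site)
    print(f"{name:10s} {site}  codons {span}  within 8-38: {in_region(span)}")
```

The check considers only where the recognition sequence lies, not the exact cleavage position within it.
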

Another illustration of the method of using a
vector according to the present invention, wherein the
vector comprises a marker inactivation system utilizing
a modified lacZα coding sequence, comprises cloning
(directionally or nondirectionally) of a DNA molecule
into two restriction enzyme sites in a region of the
modified lacZα coding sequence corresponding to the DNA
sequence encoding amino acids 8 to 38 of β-galactosidase
as illustrated in SEQ ID NO:1, in forming recombinant
vectors. For example, the DNA molecule may have a PstI
compatible end and an NheI compatible end, and then be
cloned into the modified lacZα coding sequence which had
been restricted with PstI and NheI; followed by
introducing the resultant recombinant vectors into
competent host cells; growing the host cells in the
presence of a chromogenic substrate cleavable by
β-galactosidase; and screening for indicia of lac operon
marker inactivation selected from the group consisting
of white colonies (if a plasmid or a bacterial
artificial chromosome vector is used), clear plaques (if
a phage vector is used), and lack of cell-staining (if a
vector for eukaryotic cells is used). The method may
further comprise adding an inducer of lacZα gene
expression when the host cells are grown in the presence
of a chromogenic substrate or indicator for
β-galactosidase activity such as X-gal or MacConkey
agar.
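
The notion of "compatible ends" used in these illustrations can be sketched as follows. The cleavage positions in the table are the generally published positions for these enzymes and are not stated in this specification; the function names overhang and compatible are hypothetical. The sketch handles only palindromic six-base sites cut symmetrically on both strands.

```python
# (recognition sequence, cut position on the top strand).
# Example: BamHI cleaves G^GATCC, so its cut position is 1.
ENZYMES = {
    "BamHI": ("GGATCC", 1),
    "BglII": ("AGATCT", 1),
    "BclI": ("TGATCA", 1),
    "XmaI": ("CCCGGG", 1),
    "AgeI": ("ACCGGT", 1),
    "NheI": ("GCTAGC", 1),
    "ClaI": ("ATCGAT", 2),
    "PstI": ("CTGCAG", 5),
    "SmaI": ("CCCGGG", 3),
    "NruI": ("TCGCGA", 3),
}

def overhang(name):
    """Return (kind, sequence) of the single-stranded end left by the enzyme."""
    site, cut = ENZYMES[name]
    mirror = len(site) - cut          # cut position on the opposite strand
    if cut == mirror:
        return ("blunt", "")
    kind = "5'" if cut < mirror else "3'"
    lo, hi = sorted((cut, mirror))
    return (kind, site[lo:hi])

def compatible(a, b):
    """Two ends can be ligated if both are blunt or carry identical overhangs."""
    return overhang(a) == overhang(b)

for pair in [("BamHI", "BglII"), ("XmaI", "AgeI"), ("PstI", "NheI")]:
    print(pair, overhang(pair[0]), overhang(pair[1]), compatible(*pair))
```

In this model a PstI end and an NheI end are not compatible with each other, which is what allows a fragment carrying one of each to be inserted in a single orientation.
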

Another method of using a vector according to the
present invention, wherein the vector comprises a marker
inactivation system utilizing a modified lacZα coding
sequence, comprises cloning (directionally or
nondirectionally) of a DNA molecule into a restriction
enzyme site in a region of the modified lacZα coding
sequence, corresponding to DNA sequence encoding amino
acids 8 to 38 of β-galactosidase as illustrated in SEQ
ID NO:1, and a restriction enzyme site (either in the
lacZ coding sequence or vector sequence) which is
upstream of such region of the modified lacZα coding
sequence in forming recombinant vectors. For example,
and with reference to FIG. 2B, the DNA molecule may have
a BamHI compatible end and an XmaI compatible end, and
then be cloned into a vector cleaved at a BamHI site
upstream of the codon for amino acid 8, and cleaved at
the XmaI site; followed by introducing the resultant
recombinant vectors into competent host cells; growing
the host cells in the presence of a chromogenic
substrate cleavable by β-galactosidase; and screening
for indicia of lac operon marker inactivation selected
from the group consisting of white colonies (if a
plasmid or a bacterial artificial chromosome vector is
used), clear plaques (if a phage vector is used), and
lack of cell-staining (if a vector for eukaryotic cells
is used). The method may further comprise adding an
inducer of lacZα gene expression when the host cells are
grown in the presence of a chromogenic substrate
cleavable by β-galactosidase.

A further method of using a vector according to the
present invention, wherein the vector comprises a marker
inactivation system utilizing a modified lacZα coding
sequence, comprises cloning (directionally or
nondirectionally) of a DNA molecule into a restriction
enzyme site in a region of the modified lacZα coding
sequence, corresponding to DNA sequence encoding amino
acids 8 to 38 of β-galactosidase as illustrated in SEQ
ID NO:1, and a restriction enzyme site (in the lacZ
coding sequence or in the vector sequence) which is
downstream of such region of the modified lacZα coding
sequence in forming recombinant vectors. For example,
and with reference to FIG. 2B, the DNA molecule may have
a BglII compatible end and a NruI compatible end, and
then be cloned into a vector cleaved at a BglII site
downstream of the codon for amino acid 38, and cleaved
at the NruI site in the region of codons 8 to 38 of the
modified lacZα coding sequence; followed by introducing
the resultant recombinant vectors into competent host
cells; growing the host cells in the presence of a
chromogenic substrate cleavable by β-galactosidase; and
screening for indicia of lac operon marker inactivation
selected from the group consisting of white colonies (if
a plasmid vector is used), clear plaques (if a phage
vector is used), and lack of cell-staining (if a vector
for eukaryotic cells is used). The method may further
comprise adding an inducer of lacZα gene expression when
the host cells are grown in the presence of a
chromogenic substrate cleavable by β-galactosidase.

An additional illustration of the method of using a
vector according to the present invention, wherein the
vector comprises a marker inactivation system utilizing
a modified lacZα coding sequence, comprises cloning
(directionally or nondirectionally) of a DNA molecule
into a restriction enzyme site in the lacZ coding region
or in the vector sequence which is upstream of a region
of the modified lacZα coding sequence that corresponds
to DNA sequence encoding amino acids 8 to 38 of
β-galactosidase as illustrated in SEQ ID NO:1, and a
restriction enzyme site downstream of such region but
still within the modified lacZα coding sequence, in
forming recombinant vectors. For example, and with
reference to FIG. 3B, the DNA molecule may have a BclI
compatible end and a StuI compatible end, and then be
cloned into a vector cleaved at a BclI site upstream of
the codon for amino acid 8, and cleaved at the StuI site
downstream of the region between codons 8 and 38 but
still at a restriction site engineered into the sequence
of the modified lacZα coding sequence; followed by
introducing the resultant recombinant vectors into
competent host cells; growing the host cells in the
presence of a chromogenic substrate cleavable by
β-galactosidase; and screening for indicia of lac operon
marker inactivation selected from the group consisting
of white colonies (if a plasmid vector is used), clear
plaques (if a phage vector is used), and lack of
cell-staining (if a vector for eukaryotic cells is used).
The method may further comprise adding an inducer of
lacZα gene expression when the host cells are grown in
the presence of a chromogenic substrate cleavable by
β-galactosidase.
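
The screening step common to the above illustrations can be summarized as a simple lookup from vector type to the expected indicium. This is only a sketch: the dictionary keys and the function name expected_phenotype are hypothetical labels, and the "marker intact" entry for eukaryotic cells (stained cells) is inferred as the opposite of the lack of cell-staining described in the text.

```python
# Expected result of screening on a chromogenic substrate, by vector type:
# (phenotype when the marker is inactivated, phenotype when it is intact).
INDICIA = {
    "plasmid": ("white colony", "blue colony"),
    "bac": ("white colony", "blue colony"),
    "phage": ("clear plaque", "blue plaque"),
    "eukaryotic": ("no cell staining", "stained cells"),
}

def expected_phenotype(vector_type: str, marker_inactivated: bool) -> str:
    """Return the phenotype expected for a given vector type."""
    inactivated, intact = INDICIA[vector_type.lower()]
    return inactivated if marker_inactivated else intact

for vector in INDICIA:
    print(vector, "->", expected_phenotype(vector, True),
          "/", expected_phenotype(vector, False))
```
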

From the foregoing, it will be obvious to those
skilled in the art that various modifications in the
above-described methods and vector constructs can be
made without departing from the spirit and scope of the
invention. Accordingly, the invention may be embodied
in other specific forms without departing from the
spirit or essential characteristics thereof. Present
embodiments and examples, therefore, are to be
considered in all respects as illustrative and not
restrictive, and all changes which come within the
meaning and range of equivalency of the claims are
therefore intended to be embraced therein.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

  (i) APPLICANTS: Slilaty, N. Steve
                  Lebel, Suzanne
  (ii) TITLE OF INVENTION: Modified lacZα Coding Sequences And Uses Thereof
  (iii) NUMBER OF SEQUENCES: 10
  (iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Hodgson, Russ, Andrews, Woods & Goodyear
    (B) STREET: 1800 One M&T Plaza
    (C) CITY: Buffalo
    (D) STATE: New York
    (E) COUNTRY: United States
    (F) ZIP: 14203-2391
  (v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Diskette, 3.5 inch
    (B) COMPUTER: IBM Compatible
    (C) OPERATING SYSTEM: MS-DOS/Microsoft Windows
    (D) SOFTWARE: WordPerfect for Windows
  (vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE: 1 May 1998
  (vii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Kadle, Ranjana
    (B) REGISTRATION NUMBER: 40,041
    (C) REFERENCE DOCKET NUMBER: 24945.0001
  (viii) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (716) 856-4000
    (B) TELEFAX: (716) 849-0349

(2) INFORMATION FOR SEQ ID NO:1:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 60
    (B) TYPE: amino acid
    (C) TOPOLOGY: linear
  (ii) MOLECULE TYPE: peptide
  (iii) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Thr Met Ile Thr Asp Ser Leu Ala Val Val Leu Gln Arg Arg Asp Trp Glu Asn Pro
1               5                   10                  15                  20

Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro Pro Phe Ala Ser Trp Arg Asn Ser
21              25                  30                  35                  40

Glu Glu Ala Arg Thr Asp Arg Pro Ser Gln Gln Leu Arg Ser Leu Asn Gly Glu Trp Arg
41              45                  50                  55                  60

(3) INFORMATION FOR SEQ ID NO:2:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 99 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single-stranded
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) SEQUENCE DESCRIPTION: SEQ ID NO:2:

TGACCATGAT CACGGACAGC TTAGCCGTCG TTCTGCAGCG TCGCGACTGG 50
GAAAACCCGG GCGTTACCCA GCTGAATCGA TTAGCTGCGC ATCCCCCTT 99

(4) INFORMATION FOR SEQ ID NO:3:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 92 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single-stranded
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CGCATCCCCC ATTCGCTAGC TGGCGGAATT CCGAAGAGGC GCGCACCGAT 50
AGGCCTTCCC AACAGTTGAG ATCTTTAAAT GGCGAATGGC GC 92

(5) INFORMATION FOR SEQ ID NO:4:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 100 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single-stranded
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) SEQUENCE DESCRIPTION: SEQ ID NO:4:

CAATTTCACA CAGGAGGAAA AAACCATGGT CGACTTAATA CGACTCACTA 50
TAGGGCCTTA TGGGCCCGGT ACCCGGATCC TCGAGAGCTT AGCCGTCGTT 100

(6) INFORMATION FOR SEQ ID NO:5:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 78 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single-stranded
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) SEQUENCE DESCRIPTION: SEQ ID NO:5:

TTAAATGGCG AATGGCGGTA AGCTTCGAAC GCGTATGCAT GAGCTCTTAA 50
TTAACTCCGG ATAAATTGTA AGCGTTAA 78

(7) INFORMATION FOR SEQ ID NO:6:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 74 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single-stranded
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) SEQUENCE DESCRIPTION: SEQ ID NO:6:

AGCTCTTAAT TAACTCCGGA TCTAGAGCCC GCCTAATGAG CGGGCTTTTT 50
TTTCTTAAGT AAATTGTAAG CGTT 74

(8) INFORMATION FOR SEQ ID NO:7:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 372 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double-stranded
    (D) TOPOLOGY: circular
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) FEATURE: relevant portion of circular molecule listed
  (v) SEQUENCE DESCRIPTION: SEQ ID NO:7:

TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAGG AAAAAACC ATG 51
GTC GAC TTA ATA CGA CTC ACT ATA GGG CCT TAT GGG CCC GGT 93
ACC CGG ATC CTC GAG AGC TTA GCC GTC GTT CTG CAG CGT CGC 135
GAC TGG GAA AAC CCG GGC GTT ACC CAG CTG AAT CGA TTA GCT 177
GCG CAT CCC CCA TTC GCT AGC TGG CGG AAT TCC GAA GAG GCG 219
CGC ACC GAT AGG CCT TCC CAA CAG TTG AGA TCT TTA AAT GGC 261
GAA TGG CGG TAA GCTTCGAACG CGTATGCATG AGCTCTTAAT 303
TAACTCCTCT AGAGCCCGCC TAATGAGCGG GCTTTTTTTT CTTAAGTAAA 353
TTGTAAGCGT TAATATTTT 372

(9) INFORMATION FOR SEQ ID NO:8:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 31 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single-stranded
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) SEQUENCE DESCRIPTION: SEQ ID NO:8:

ACCAATGAAA CCATCTATAG CAGCACCGTA A 31

(10) INFORMATION FOR SEQ ID NO:9:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 31 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single-stranded
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) SEQUENCE DESCRIPTION: SEQ ID NO:9:

GGAGCAAACA AGAGAGTCGA TGAACGGTAA T 31

(11) INFORMATION FOR SEQ ID NO:10:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 249 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double-stranded or single-stranded
    (D) TOPOLOGY: circular
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) FEATURE: relevant portion of circular molecule listed
  (v) SEQUENCE DESCRIPTION: SEQ ID NO:10:

TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCT ATG ACC 51
ATG ATC ACG GAC AGC TTA GCC GTC GTT CTG CAG CGT CGC GAC TGG 96
GAA AAC CCG GGC GTT ACC CAG CTG AAT CGA TTA GCT GCG CAT CCC 141
CCA TTC GCT AGC TGG CGG AAT TCC GAA GAG GCG CGC ACC GAT AGG 186
CCT TCC CAA CAG TTG AGA TGT GAG GCC GAT ACT GTC GTC GTC CCC 231
TCA AAC TGG CAG ATG CAC 249
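
As an informal consistency check, the reading frame of SEQ ID NO:10 can be translated and compared with the first 40 residues of SEQ ID NO:1. The sketch below assumes the standard genetic code, which is general knowledge rather than something stated in this listing; the names CODON_TABLE and translate are hypothetical.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# (an assumption: the listing does not specify a translation table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Codons 1-40 of the open reading frame printed in SEQ ID NO:10,
# starting at the ATG that follows ...ACACAGGAAA CAGCT.
CODING = (
    "ATG ACC ATG ATC ACG GAC AGC TTA GCC GTC "
    "GTT CTG CAG CGT CGC GAC TGG GAA AAC CCG "
    "GGC GTT ACC CAG CTG AAT CGA TTA GCT GCG "
    "CAT CCC CCA TTC GCT AGC TGG CGG AAT TCC"
)

def translate(codons: str) -> str:
    """Translate whitespace-separated codons into one-letter amino acids."""
    return "".join(CODON_TABLE[codon] for codon in codons.split())

peptide = translate(CODING)
print(peptide)
# Residues 8 to 38 (1-based), the region discussed in the description.
print(peptide[7:38])
```
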

(12) INFORMATION FOR SEQ ID NO:11:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 375 nucleotides
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double-stranded or single-stranded
    (D) TOPOLOGY: circular
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: No
  (iv) FEATURE: relevant portion of circular molecule listed
  (v) SEQUENCE DESCRIPTION: SEQ ID NO:11:

TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAGG AAAAAACCAT 50
GGTCGACTTA ATACGACTCA CTATAGGGCC TTATGGGCCC GGTACCCGGA 100
TCCTCGAGAG CTTAGCCGTC GTTCTGCAGC GTCGCGACTG GGAAAACCCG 150
GGCGTTACCC AGCTGAATCG ATTAGCTGCG CATCCCCCAT TCGCTAGCTG 200
GCGGAATTCC GAAGAGGCGC GCACCGATAG GCCTTCCCAA CAGTTGAGAT 250
CTTTAAATGG CGAATGGCGG TAAGCTTCGA ACGCGTATGC ATGAGCTCTT 300
AATTAACTCC GGATCTAGAG CCCGCCTAAT GAGCGGGCTT TTTTTTCTTA 350
AGGCCGCATC GAATATAACT TCGTA 375

What is claimed is:

1. A vector for cloning a DNA molecule, wherein the
vector comprises at least one promoter operably linked
to a modified lacZα coding sequence encoding an
α-peptide and containing at least one cloning site
cleavable by a distinct restriction enzyme which has
been introduced within a lacZ coding sequence downstream
of and including the codon for amino acid 8 of
β-galactosidase.

2. The vector according to claim 1, wherein the at
least one cloning site is introduced within a lacZ
coding sequence from and including the codon for amino
acid 8 to and including the codon for amino acid 38 of
β-galactosidase.

3. The vector according to claim 1, wherein the at
least one cloning site is a six-base palindrome.

4. The vector according to claim 1, wherein the at
least one cloning site is a restriction site selected
from the group consisting of sites cleaved by the
enzymes EspI, PstI, NruI, SmaI/XmaI, PvuII, MunI, ClaI,
AgeI, FspI and NheI.

5. The vector according to claim 2, wherein the at
least one cloning site is a six-base palindrome.

6. The vector according to claim 2, wherein the at
least one cloning site is a restriction site selected
from the group consisting of sites cleaved by the
enzymes EspI, PstI, NruI, SmaI/XmaI, PvuII, MunI, ClaI,
AgeI, FspI and NheI.

7. The vector according to claim 1, wherein the vector
is a plasmid vector.

8. The plasmid vector according to claim 7, wherein
the plasmid vector is pSNS527.

9. The vector according to claim 1, wherein the vector
is a bacterial artificial chromosome vector.

10. The bacterial artificial chromosome vector
according to claim 9, wherein the bacterial artificial
chromosome vector is pSNS528.

11. The plasmid vector according to claim 7, wherein
the at least one cloning site is a six-base palindrome.

12. The plasmid vector according to claim 7, wherein
the at least one cloning site is a restriction site
selected from the group consisting of sites cleaved by
the following restriction enzymes: EspI, PstI, NruI,
SmaI/XmaI, MunI, ClaI, AgeI, FspI and NheI.

13. The plasmid vector according to claim 7, further
comprising at least one additional feature, located
outside the lacZα coding sequence, selected from the
group consisting of a replicon, an antibiotic resistance
gene, a gene that functionally complements an auxotroph,
a ribosomal binding region, a transcription terminator,
at least one phage promoter, one or more restriction
sites comprising an eight-base recognition sequence, at
least one restriction site for an endonuclease that
generates ExoIII resistant 3' overhangs, a phage origin
of replication, and combinations thereof.

14. The plasmid vector according to claim 13, wherein
the antibiotic resistance gene is a gene that confers
resistance to an antibiotic selected from the group
consisting of ampicillin, kanamycin, tetracycline, and
chloramphenicol.

15. The vector according to claim 1, wherein the vector
is a phage vector.

16. The phage vector according to claim 15, wherein the
phage vector is M13spl3.

17. The phage vector according to claim 15, wherein the
at least one cloning site is a six-base palindrome.

18. The phage vector according to claim 15, wherein the
at least one cloning site is a restriction site
selected from the group consisting of sites cleaved by
the enzymes: EspI, PstI, NruI, SmaI/XmaI, PvuII, MunI,
ClaI, AgeI, FspI, and NheI.

19. A kit comprising as a component the vector
according to claim 1, and further comprising a component
selected from the group consisting of host cells into
which the vector is introduced, a chromogenic substrate
or indicator for β-galactosidase activity, an inducer of
lacZα gene expression, one or more restriction enzymes
specific for restriction sites located within the
vector, and combinations thereof.

20. A kit comprising as a component the vector
according to claim 7, and further comprising a component
selected from the group consisting of host cells into
which the plasmid vector is introduced, a chromogenic
substrate or indicator for β-galactosidase activity, an
inducer of lacZα gene expression, one or more
restriction enzymes specific for restriction sites
located within the vector, and combinations thereof.

21. A kit comprising as a component the vector
according to claim 15, and further comprising a component
selected from the group consisting of host cells into
which the phage vector is introduced, a chromogenic
substrate or indicator for β-galactosidase activity, an
inducer of lacZα gene expression, one or more
restriction enzymes specific for restriction sites
located within the vector, and combinations thereof.

22. A method of using the vector according to claim 1,
wherein the method comprises:
(a) cloning a DNA molecule into the at least one
cloning site in the modified lacZα coding sequence to
form recombinant vectors;
(b) introducing the recombinant vectors into host
cells, wherein the host cells require α-complementation
to produce β-galactosidase activity;
(c) growing the host cells in the presence of a
chromogenic substrate or indicator for β-galactosidase
activity; and
(d) screening for indicia of lac operon marker
inactivation.

23. The method according to claim 22, wherein the at
least one cloning site has been introduced within a lacZ
coding sequence from and including the codon for amino
acid 8 to and including the codon for amino acid 38 of
β-galactosidase.

24. The method according to claim 22, further
comprising adding an inducer of lacZα gene expression
when the host cells are grown in the presence of a
chromogenic substrate or an indicator of β-galactosidase
activity.

25. The method according to claim 22, wherein the
method comprises cloning a DNA molecule into a single
restriction enzyme site in the modified lacZα coding
sequence.

26. The method according to claim 22, wherein the
method comprises cloning of a DNA molecule into two
restriction enzyme sites in the modified lacZα coding
sequence.

27. The method according to claim 22, wherein the
method comprises cloning of a DNA molecule into a first
restriction enzyme site in the modified lacZα coding
sequence; and a second restriction enzyme site which is
upstream of the first restriction enzyme site, said
second restriction enzyme site being located in a
sequence selected from the group consisting of lacZ
coding sequence, and vector sequence.

28. The method according to claim 22, wherein the
method comprises cloning of a DNA molecule into a first
restriction enzyme site in the modified lacZα coding
sequence; and a second restriction enzyme site which is
downstream of the first restriction enzyme site, said
second restriction enzyme site being located in a
sequence selected from the group consisting of lacZ
coding sequence, and vector sequence.

29. The method according to claim 22, wherein the
method comprises cloning of a DNA molecule into a first
restriction enzyme site which is upstream of the
modified lacZα coding sequence, said first restriction
enzyme site being located in a sequence selected from
the group consisting of lacZ coding sequence, and vector
sequence; and a second restriction enzyme site which is
downstream of the modified lacZα coding sequence, said
second restriction enzyme site being located in a
sequence selected from the group consisting of lacZ
coding sequence, and vector sequence.

30. A method for modifying a gene for use as an
indicator of insertion of a DNA molecule in a cloning
vector comprising introducing at least one restriction
enzyme site in a region of the gene that is required for
the indicator activity of the protein encoded by the
gene such that insertion of the DNA molecule in the
region results in less than 10% false negatives.
                5                   10
M   T   M   I   T   D   S   L   A   V
ATG ACN ATG ATH ACN GAY EEE LLL GCN GTN
    TGATCA (BclI)
    GCTAGC (NheI)
    AGTACT (ScaI)
    GCTNAGC (EspI)

                15                  20
V   L   Q   R   R   D   W   E   N   P
GTN LLL CAP RRR RRR GAY TGG GAP AAY CCN
    TTGCAA (Unknown)
    CCCGGG (SmaI)
    CTGCAG (PstI)
    CCCGGG (SmaI)
    TCGCGA (NruI)

                25                  30
G   V   T   Q   L   N   R   L   A   A
GGN GTN ACN CAP LLL AAY RRR LLL GCN GCN
    CAATTG (MunI)
    TGCGCA (FspI)
    CAGCTG (PvuII)
    ACCGGT (AgeI)
    ATCGAT (ClaI)
    GCTAGC (NheI)

                35                  40
H   P   P   F   A   S   W   R   N   S
CAY CCN CCN TTY GCN EEE TGG RRR AAY EEE
    TCGCGA (NruI)
    GAATTC (EcoRI)
    TTGCAA (Unknown)
    GCTAGC (NheI)
    CAGCTG (PvuII)

Fig. 1

WO 98/50566    PCT/US98/08854

                45
E   E   A   R   T   D   R   P   S   Q
GAP GAP GCN RRR ACN GAY RRR CCN EEE CAP
    GCGCGC (BssHII)
    CGTACG (Pfl23II)
    CGATCG (PvuI)
    AGGCCT (StuI)
    CGGCCG (Eco52I)

Q   L   R   S   L   N   G   E   W   R
CAP LLL RRR EEE LLL AAY GGN GAP TGG RRR
    CAATTG (MunI)
    CAGCTG (PvuII)
    TCCGGA (Kpn2I)
    TGCGCA (FspI)
    TTCGAA (AsuII)
    TACGTA (SnaBI)
    AGATCT (BglII)
    CGATCG (PvuI)
    AAGCTT (HindIII)
    TTTAAA (DraI)

Fig. 1 (continued)